On the Usage of Kappa to Evaluate Agreement on Coding Tasks
Abstract
In recent years, the Kappa coefficient of agreement has become the de facto standard to evaluate intercoder agreement in the discourse and dialogue processing community. Together with the adoption of this standard, researchers have adopted one specific scale to evaluate Kappa values, the one proposed in (Krippendorff, 1980). In this position paper, I highlight some issues that should be taken into account when evaluating Kappa values. Finally, I speculate on whether Kappa could be used as a measure to evaluate a system’s performance.
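To make the quantity under discussion concrete, here is a minimal sketch of a Kappa computation for two coders. The abstract does not say which variant of Kappa is meant, so this assumes the two-coder Cohen's kappa, (P(A) - P(E)) / (1 - P(E)), where P(A) is the observed agreement and P(E) the agreement expected by chance from each coder's own label distribution. The function name `cohen_kappa` and the dialogue-act labels are hypothetical and used only for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labelled the same items (illustrative sketch)."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)

    # P(A): proportion of items on which the two coders chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # P(E): chance agreement, computed from each coder's own label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical label throughout: kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical example: two coders tag ten utterances as question ("q") or statement ("s").
coder_a = ["q", "q", "q", "q", "q", "s", "s", "s", "s", "s"]
coder_b = ["q", "q", "q", "q", "s", "s", "s", "s", "s", "q"]
kappa = cohen_kappa(coder_a, coder_b)  # observed agreement 0.8, chance agreement 0.5
print(round(kappa, 3))
```

In this example the coders agree on 80% of the items, yet the chance-corrected value is noticeably lower, which is why the scale used to interpret Kappa values matters.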
Similar resources
Diagnostic concordance among dermatopathologists in basal cell carcinoma subtyping: Results of a study in a skin referral hospital in Tehran, Iran
Background: Basal cell carcinomas (BCC) are the most prevalent among non-melanoma skin cancers (NMSC), which correspond to the most common skin cancers. BCC histopathological subtyping is a problem in therapeutic management. Therefore, we have decided to perform a histopathologic study for better classification of BCCs based on interobserver diagnostic judgment. Methods: We conducted this cross...
Investigating the reliability of the Washington State ergonomic checklist using inter-observer agreement in two groups: ergonomics experts and non-experts
Background & Objectives: Assessment of physical risk factors related to musculoskeletal disorders is performed by different methods, including observational methods. The validity of these methods is important in workplaces. The purpose of this study was to investigate the reliability of the Washington State ergonomics checklist as an observational method. Methods: This descriptive-analytic...
Nurse-Physician Agreement on Triage Category: A Reliability Analysis of Emergency Severity Index
Background and Objectives: The Emergency Severity Index (ESI) triage is commonly used in clinical settings to determine the patients’ emergency severity. However, the reliability of this index is not sufficiently explored. The present study examines the inter-rater reliability of ESI by comparing triage ratings as performed by nurses and physicians. Methods: This prospective cross-sectional st...
Designing and Compiling a Friendship-Oriented Leadership Model
Purpose: The purpose of this research is to design and compile a model of friendship-oriented leadership. The statistical population comprised experts in leadership and management and experts in broadcasting. Methodology: In this research, a total of 20 people were selected as participants using a targeted sampling approach. The data was collect...
Interpretation of immunohistochemistry results for HER2/neu expression in invasive breast cancer: a study of inter-observer agreement and of agreement between two readings by the same observer
Background and Aim: The accurate determination of HER-2 in invasive breast cancer has become a critical issue, particularly in the context of the results of Herceptin adjuvant therapy. The aim of this study was to evaluate inter- and intraobserver reproducibility of assessment of HER2/neu immunostaining in invasive breast cancer. Materials and Methods: This study was cross-sectional and the conve...